Comments for MEDB 5501, Week 6

Assessing normality

  • Problems caused by non-normality
    • Poor confidence intervals, hypothesis tests
      • Too much imprecision
      • Poor coverage probability
        • Especially for one-tailed tests
    • Inability to extrapolate
  • What about the Central Limit Theorem?
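
The coverage problem can be explored with a small simulation. The sketch below is an illustration only, not part of the course materials. It assumes samples of size 10, uses an exponential distribution as a stand-in for right-skewed data, and hard-codes the t critical value 2.262 (97.5th percentile, 9 degrees of freedom). The function name `miss_rates` and the seed are arbitrary choices.

```python
import random
from statistics import mean, stdev

N = 10          # assumed sample size
T_CRIT = 2.262  # 97.5th percentile of t with N - 1 = 9 degrees of freedom

def miss_rates(draw, true_mean, reps=20000, seed=1):
    """Estimate how often a nominal 95% t interval misses the true mean.

    Returns (below, above): the fraction of intervals lying entirely
    below the true mean and entirely above it.
    """
    rng = random.Random(seed)
    below = above = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(N)]
        center = mean(sample)
        half_width = T_CRIT * stdev(sample) / N ** 0.5
        if center + half_width < true_mean:
            below += 1   # whole interval sits below the true mean
        elif center - half_width > true_mean:
            above += 1   # whole interval sits above the true mean
    return below / reps, above / reps

# Symmetric case: standard normal data, true mean 0.
normal_below, normal_above = miss_rates(lambda rng: rng.gauss(0.0, 1.0), 0.0)
# Right-skewed case: exponential data with rate 1, true mean 1.
skewed_below, skewed_above = miss_rates(lambda rng: rng.expovariate(1.0), 1.0)

print(f"normal data: {normal_below:.3f} below, {normal_above:.3f} above")
print(f"skewed data: {skewed_below:.3f} below, {skewed_above:.3f} above")
```

With normal data, each miss rate should land near the nominal 2.5%. With the skewed data, the misses tend to pile up on one side, which is why a one-tailed procedure can be hit harder than a two-tailed one.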

How to handle non-normality

  • Ignore it
    • Central Limit Theorem
  • Transform your data
  • Use alternatives
    • Nonparametric tests (covered in a later module)
    • Bootstrap (covered in a later module)
    • Randomization tests (not covered in this class)
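
As a rough illustration of the "transform your data" option, the sketch below applies a log transformation to simulated right-skewed data and compares skewness before and after. The lognormal data, the seed, and the helper `skewness` are assumptions made for this example. A log transformation is only one possible choice and requires strictly positive values.

```python
import math
import random
from statistics import mean, pstdev

def skewness(values):
    """Sample skewness: the average cubed z-score (near 0 for symmetric data)."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

rng = random.Random(2)  # arbitrary seed, for reproducibility only
# Simulated right-skewed data (lognormal), not real measurements.
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]
logged = [math.log(v) for v in raw]

print(f"skewness before log transform: {skewness(raw):.2f}")
print(f"skewness after log transform:  {skewness(logged):.2f}")
```

The logarithm of lognormal data is normal by construction, so this is a best case. Real data will usually be only partly straightened out by a transformation.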

Normal histogram with n=100

Normal histogram with n=1,000

Normal histogram with n=10,000

Normal histogram with n=100,000

Normal histogram with n=1,000,000

Normal distribution (n=infinity)
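
The sequence of histograms above can be imitated numerically. The sketch below is a hypothetical illustration: the bin width of 0.5, the range from -4 to 4, and the seed are arbitrary assumptions. It measures how far a histogram of simulated standard normal data sits from the probabilities the normal distribution assigns to the same bins. The n=1,000,000 case is left out only to keep the run short.

```python
import random
from statistics import NormalDist

def max_bin_error(n, seed=3, width=0.5, lo=-4.0, hi=4.0):
    """Largest gap between a histogram's bin proportions and the
    probabilities a standard normal distribution gives the same bins."""
    rng = random.Random(seed)
    n_bins = int((hi - lo) / width)
    counts = [0] * n_bins
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        if lo <= x < hi:
            # min() guards against rounding at the upper edge
            k = min(int((x - lo) / width), n_bins - 1)
            counts[k] += 1
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    errors = []
    for k in range(n_bins):
        left = lo + k * width
        expected = z.cdf(left + width) - z.cdf(left)
        errors.append(abs(counts[k] / n - expected))
    return max(errors)

for n in (100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7,}: max bin error = {max_bin_error(n):.4f}")
```

The error should shrink as n grows, which is the numerical counterpart of the histogram smoothing out into the bell curve.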

Normal(2, 1)

Normal(-1, 1)

Normal(0, 2)

Normal(0, 0.5)

Skewed right

Right skewness is characterized by the tails of the distribution

  • Heavy right tail
    • Greater tendency to produce extreme values on the right
  • Light left tail
    • Lesser tendency to produce extreme values on the left
  • Right skewness is the most common type of non-normality
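
The tail description above can be checked by counting extreme values on each side. The sketch below is an illustration under stated assumptions: it uses simulated exponential data as the right-skewed example, defines "extreme" as more than 2 standard deviations from the mean, and uses a hypothetical helper named `tail_fractions`.

```python
import random
from statistics import mean, stdev

def tail_fractions(values, k=2.0):
    """Fraction of values more than k standard deviations
    below the mean (left) and above the mean (right)."""
    m, s = mean(values), stdev(values)
    left = sum(v < m - k * s for v in values) / len(values)
    right = sum(v > m + k * s for v in values) / len(values)
    return left, right

rng = random.Random(4)  # arbitrary seed, for reproducibility only
skewed = [rng.expovariate(1.0) for _ in range(5000)]    # right-skewed
symmetric = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # normal

skewed_left, skewed_right = tail_fractions(skewed)
normal_left, normal_right = tail_fractions(symmetric)

print(f"right-skewed: {skewed_left:.3f} left, {skewed_right:.3f} right")
print(f"normal:       {normal_left:.3f} left, {normal_right:.3f} right")
```

For a normal distribution about 2.3% of values fall beyond 2 standard deviations in each tail. The right-skewed sample should show clearly more than that on the right and essentially none on the left.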

Normal probability plot

  • Compare data to evenly spaced percentiles of the normal distribution
  • Example with n=4
    • Compare smallest value with \(Z_{0.2}\)
    • Compare next value with \(Z_{0.4}\)
    • Compare next value with \(Z_{0.6}\)
    • Compare largest value with \(Z_{0.8}\)
  • No single best definition of "evenly spaced"
    • The 12.5th, 37.5th, 62.5th, and 87.5th percentiles, for example
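
The n=4 example can be written out as a short calculation. The sketch below is one possible way to compute the points of a normal probability plot. The function names and the four data values are made up for illustration, and the two rules shown, i/(n+1) and (i-0.5)/n, are simply the ones that reproduce the percentiles listed above.

```python
from statistics import NormalDist

def plotting_positions(n, rule="i/(n+1)"):
    """Return n evenly spaced probabilities under one of two common rules."""
    if rule == "i/(n+1)":
        return [i / (n + 1) for i in range(1, n + 1)]
    if rule == "(i-0.5)/n":
        return [(i - 0.5) / n for i in range(1, n + 1)]
    raise ValueError(f"unknown rule: {rule}")

def normal_plot_pairs(data, rule="i/(n+1)"):
    """Pair each sorted data value with the matching standard normal quantile."""
    probs = plotting_positions(len(data), rule)
    quantiles = [NormalDist().inv_cdf(p) for p in probs]
    return list(zip(quantiles, sorted(data)))

data = [5.2, 3.1, 9.8, 4.7]  # hypothetical values, not from the course

positions_a = plotting_positions(4)               # 0.2, 0.4, 0.6, 0.8
positions_b = plotting_positions(4, "(i-0.5)/n")  # 0.125, 0.375, 0.625, 0.875
pairs = normal_plot_pairs(data)

for z, x in pairs:
    print(f"normal quantile {z:6.3f}  vs  data value {x}")
```

Plotting the data values against the normal quantiles gives the normal probability plot. Points that fall close to a straight line are consistent with a normal distribution.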

Histogram for right-skewed data

Histogram for left-skewed data

Histogram for heavy-tailed data

Histogram for light-tailed data

Histogram for bimodal data

Histogram for normal data